{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# CHAPTER 4 \n",
    "# NumPy Basics: Arrays and Vectorized Computation\n",
    "\n",
    "在数值计算领域，说Numpy是python最重要的包也不为过。在numpy中有下面这些东西：\n",
    "\n",
    "- ndarray, 一个有效的多维数组，能提供以数组为导向的快速数值计算和灵活的广播功能（broadcasting）\n",
    "\n",
    "- 便利的数学函数\n",
    "\n",
    "- 用于读取/写入(reading/writing)数据到磁盘的便利工具\n",
    "\n",
    "- 线性代数，随机数生成，傅里叶变换能力\n",
    "\n",
    "- 可以用C API来写C，C++，或FORTRAN\n",
    "\n",
    "通过学习理解numpy中数组和数组导向计算，能帮我们理解pandas之类的工具。\n",
    "\n",
    "# 4.1 The NumPy ndarray: A Multidimensional Array Object（ndarray: 多维数组对象）\n",
    "\n",
    "N-dimensional array object（n维数组对象）, or ndarray，这是numpy的关键特征。先来尝试一下，生成一个随机数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Generate some random data\n",
    "data = np.random.randn(2, 3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.35512366, -0.63779545,  0.14137933],\n",
       "       [ 0.36642056,  0.30898139, -0.87040292]])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "进行一些数学运算："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-3.55123655, -6.37795453,  1.41379333],\n",
       "       [ 3.66420556,  3.0898139 , -8.70402916]])"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data * 10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.71024731, -1.27559091,  0.28275867],\n",
       "       [ 0.73284111,  0.61796278, -1.74080583]])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data + data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "每一个数组都有一个shape，来表示维度大小。而dtype，用来表示data type："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2, 3)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('float64')"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 1 Greating ndarrays (创建n维数组)\n",
    "\n",
    "最简单的方法使用array函数，输入一个序列即可，比如list："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 6. ,  7.5,  8. ,  0. ,  1. ])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data1 = [6, 7.5, 8, 0, 1]\n",
    "arr1 = np.array(data1)\n",
    "arr1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "嵌套序列能被转换为多维数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2, 3, 4],\n",
       "       [5, 6, 7, 8]])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]\n",
    "arr2 = np.array(data2)\n",
    "arr2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "因为data2是一个list of lists, 所以arr2维度为2。我们能用ndim和shape属性来确认一下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2.ndim"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2, 4)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "除非主动声明，否则np.array会自动给data搭配适合的类型，并保存在dtype里："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('float64')"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr1.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('int64')"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "除了np.array，还有一些其他函数能创建数组。比如zeros,ones,另外还可以在一个tuple里指定shape："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.zeros(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  0.,  0.,  0.,  0.,  0.],\n",
       "       [ 0.,  0.,  0.,  0.,  0.,  0.],\n",
       "       [ 0.,  0.,  0.,  0.,  0.,  0.]])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.zeros((3, 6))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[  0.00000000e+000,   0.00000000e+000],\n",
       "        [  2.16538378e-314,   2.16514681e-314],\n",
       "        [  2.16511832e-314,   2.16072529e-314]],\n",
       "\n",
       "       [[  0.00000000e+000,   0.00000000e+000],\n",
       "        [  2.14037397e-314,   6.36598737e-311],\n",
       "        [  0.00000000e+000,   0.00000000e+000]]])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.empty((2, 3, 2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "np.empty并不能保证返回所有是0的数组，某些情况下，会返回为初始化的垃圾数值，比如上面。\n",
    "\n",
    "arange是一个数组版的python range函数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.arange(15)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里是一些创建数组的函数：\n",
    "\n",
    "![](../MarkdownPhotos/chp04/屏幕快照 2017-10-24 下午1.04.36.png)\n",
    "\n",
    "![](../MarkdownPhotos/chp04/屏幕快照 2017-10-24 下午1.04.36_cn.png)\n",
    "\n",
    "# 2 Data Types for ndarrays\n",
    "\n",
    "dtype保存数据的类型："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "arr1 = np.array([1, 2, 3], dtype=np.float64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "arr2 = np.array([1, 2, 3], dtype=np.int32)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('float64')"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr1.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('int32')"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "dtype才是numpy能灵活处理其他外界数据的原因。\n",
    "\n",
    "类型表格：\n",
    "\n",
    "![](../MarkdownPhotos/chp04/屏幕快照 2017-10-24 下午1.15.52.png)\n",
    "\n",
    "![](../MarkdownPhotos/chp04/屏幕快照 2017-10-24 下午1.15.52_cn.png)\n",
    "\n",
    "可以用astype来转换类型："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('int64')"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr = np.array([1, 2, 3, 4, 5])\n",
    "arr.dtype"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dtype('float64')"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "float_arr = arr.astype(np.float64)\n",
    "float_arr.dtype"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面是把int变为float。如果是把float变为int，小数点后的部分会被丢弃："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  3.7,  -1.2,  -2.6,   0.5,  12.9,  10.1])"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])\n",
    "arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 3, -1, -2,  0, 12, 10], dtype=int32)"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr.astype(np.int32)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "还可以用astype把string里的数字变为实际的数字："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([b'1.25', b'-9.6', b'42'], \n",
       "      dtype='|S4')"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.string_)\n",
    "numeric_strings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([  1.25,  -9.6 ,  42.  ])"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numeric_strings.astype(float)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "要十分注意`numpy.string_`类型，这种类型的长度是固定的，所以可能会直接截取部分输入而不给警告。\n",
    "\n",
    "如果转换（casting）失败的话，会给出一个ValueError提示。\n",
    "\n",
    "可以用其他数组的dtype直接来制定类型："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "int_array = np.arange(10)\n",
    "\n",
    "calibers = np.array([.22, .270, .357, .380, .44, .50], dtype=np.float64)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "int_array.astype(calibers.dtype)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "还可以利用类型的缩写，比如u4就代表unit32："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 0, 0, 0, 0, 0, 0, 0], dtype=uint32)"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "empty_unit32 = np.empty(8, dtype='u4')\n",
    "empty_unit32"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "记住，astype总是会返回一个新的数组\n",
    "\n",
    "# 3 Arithmetic with NumPy Arrays（数组计算）\n",
    "\n",
    "数组之所以重要，是因为不用写for循环就能表达很多操作，这种特性叫做vectorization(向量化)。任何两个大小相等的数组之间的运算，都是element-wise（点对点）："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "arr = np.array([[1., 2., 3.], [4., 5., 6.]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.,  2.,  3.],\n",
       "       [ 4.,  5.,  6.]])"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  1.,   4.,   9.],\n",
       "       [ 16.,  25.,  36.]])"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr * arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  0.,  0.],\n",
       "       [ 0.,  0.,  0.]])"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr - arr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "element-wise 我翻译为点对点，就是指两个数组的运算，在同一位置的元素间才会进行运算。\n",
    "\n",
    "这种算数操作如果涉及标量（scalar）的话，会涉及到数组的每一个元素：\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.        ,  0.5       ,  0.33333333],\n",
       "       [ 0.25      ,  0.2       ,  0.16666667]])"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "1 / arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 1.        ,  1.41421356,  1.73205081],\n",
       "       [ 2.        ,  2.23606798,  2.44948974]])"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr ** 0.5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "两个数组的比较会产生布尔数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  0.,   4.,   1.],\n",
       "       [  7.,   2.,  12.]])"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])\n",
    "arr2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[False,  True, False],\n",
       "       [ True, False,  True]], dtype=bool)"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2 > arr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 4 Basic Indexing and Slicing（基本的索引和切片）\n",
    "\n",
    "一维的我们之前已经在list部分用过了，没什么不同："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr = np.arange(10)\n",
    "arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr[5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([5, 6, 7])"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr[5:8]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "arr[5:8] = 12"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里把12赋给`arr[5:8]`，其实用到了broadcasted（我觉得应该翻译为广式转变）。这里有一个比较重要的概念需要区分，python内建的list与numpy的array有个明显的区别，这里array的切片后的结果只是一个views（视图），用来代表原有array对应的元素，而不是创建了一个新的array。但list里的切片是产生了一个新的list："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([12, 12, 12])"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr_slice = arr[5:8]\n",
    "arr_slice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果我们改变arr_slice的值，会反映在原始的数组arr上："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "arr_slice[1] = 12345"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([    0,     1,     2,     3,     4,    12, 12345,    12,     8,     9])"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`[:]`这个赋值给所有元素："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "arr_slice[:] = 64"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "之所以这样设计是出于性能和内存的考虑，毕竟如果总是复制数据的话，会很影响运算时间。当然如果想要复制，可以使用copy()方法，比如`arr[5:8].copy()`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在一个二维数组里，单一的索引指代的是一维的数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([7, 8, 9])"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])\n",
    "arr2d[2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "有两种方式可以访问单一元素："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d[0][2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d[0, 2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们可以把下图中的axis0看做row（行），把axis1看做column（列）：\n",
    "\n",
    "![](../MarkdownPhotos/chp04/屏幕快照 2017-10-24 下午2.08.18.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对于多维数组，如果省略后面的索引，返回的将是一个低纬度的多维数组。比如下面一个2 x 2 x 3数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 1,  2,  3],\n",
       "        [ 4,  5,  6]],\n",
       "\n",
       "       [[ 7,  8,  9],\n",
       "        [10, 11, 12]]])"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])\n",
    "arr3d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "arr3d[0]是一个2x3数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2, 3],\n",
       "       [4, 5, 6]])"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr3d[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "标量和数组都能赋给arr3d[0]:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[42, 42, 42],\n",
       "        [42, 42, 42]],\n",
       "\n",
       "       [[ 7,  8,  9],\n",
       "        [10, 11, 12]]])"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "old_values = arr3d[0].copy()\n",
    "\n",
    "arr3d[0] = 42\n",
    "\n",
    "arr3d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 1,  2,  3],\n",
       "        [ 4,  5,  6]],\n",
       "\n",
       "       [[ 7,  8,  9],\n",
       "        [10, 11, 12]]])"
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr3d[0] = old_values\n",
    "arr3d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`arr3d[1, 0]`会给你一个(1, 0)的一维数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([7, 8, 9])"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr3d[1, 0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面的一步等于下面的两步："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 7,  8,  9],\n",
       "       [10, 11, 12]])"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = arr3d[1]\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([7, 8, 9])"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "一定要牢记这些切片后返回的数组都是views\n",
    "\n",
    "## Indexing with slices（用切片索引）\n",
    "\n",
    "一维的话和python里的list没什么差别："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0,  1,  2,  3,  4, 64, 64, 64,  8,  9])"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1,  2,  3,  4, 64])"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr[1:6]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "二维的话，数组的切片有点不同："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2, 3],\n",
       "       [4, 5, 6]])"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d[:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以看到，切片是沿着axis 0（行）来处理的。所以，数组中的切片，是要沿着设置的axis来处理的。我们可以把arr2d[:2]理解为“选中arr2d的前两行”。\n",
    "\n",
    "当然，给定多个索引后，也可以使用复数切片："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 2, 3],\n",
       "       [4, 5, 6],\n",
       "       [7, 8, 9]])"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[2, 3],\n",
       "       [5, 6]])"
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d[:2, 1:] # 前两行，第二列之后"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "记住，选中的是array view。通过混合整数和切片，能做低维切片。比如，我们选中第二行的前两列："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([4, 5])"
      ]
     },
     "execution_count": 82,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d[1, :2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "选中第三列的前两行："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3, 6])"
      ]
     },
     "execution_count": 83,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d[:2, 2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "冒号表示提取整个axis（轴）："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1],\n",
       "       [4],\n",
       "       [7]])"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d[:, :1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "看图示有助于理解：\n",
    "![](../MarkdownPhotos/chp04/屏幕快照 2017-10-24 下午2.41.52.png)\n",
    "\n",
    "赋值也很方便："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 0, 0],\n",
       "       [4, 0, 0],\n",
       "       [7, 8, 9]])"
      ]
     },
     "execution_count": 85,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr2d[:2, 1:] = 0\n",
    "arr2d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 5 Boolean Indexing (布尔索引)\n",
    "\n",
    "假设我们的数组数据里有一些重复。这里我们用numpy.random里的randn函数来随机生成一些离散数据："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], \n",
       "      dtype='<U4')"
      ]
     },
     "execution_count": 87,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])\n",
    "names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.02584271, -1.53529621,  0.73143988, -0.34086189],\n",
       "       [ 0.40864782,  0.53476799,  1.09620596,  0.4846564 ],\n",
       "       [ 1.95024076, -0.37291038, -0.40424703,  0.30297059],\n",
       "       [-0.48632936,  0.63817756, -0.40792716, -1.48037389],\n",
       "       [-0.81976335, -1.10162466, -0.59823212, -0.10926744],\n",
       "       [-0.5212113 ,  0.29449179,  2.0568032 ,  2.00515735],\n",
       "       [-2.36066876, -0.3294302 , -0.24464646, -0.81432884]])"
      ]
     },
     "execution_count": 93,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data = np.random.randn(7, 4)\n",
    "data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "假设每一个name对应data数组中的一行，我们想要选中name为'Bob'的所有行。就像四则运算，用比较运算符（==）："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ True, False, False,  True, False, False, False], dtype=bool)"
      ]
     },
     "execution_count": 94,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "names == 'Bob'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "然后用这个布尔数组当做索引："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.02584271, -1.53529621,  0.73143988, -0.34086189],\n",
       "       [-0.48632936,  0.63817756, -0.40792716, -1.48037389]])"
      ]
     },
     "execution_count": 95,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[names == 'Bob']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "注意：布尔数组和data数组的长度要一样。\n",
    "\n",
    "我们可以选中names=='Bob'的行，然后索引列："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.73143988, -0.34086189],\n",
       "       [-0.40792716, -1.48037389]])"
      ]
     },
     "execution_count": 96,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[names == 'Bob', 2:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([-0.34086189, -1.48037389])"
      ]
     },
     "execution_count": 97,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[names == 'Bob', 3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "选中除了'Bob'外的所有行，可以用`!=`或者`~`："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([False,  True,  True, False,  True,  True,  True], dtype=bool)"
      ]
     },
     "execution_count": 98,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "names != 'Bob'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.40864782,  0.53476799,  1.09620596,  0.4846564 ],\n",
       "       [ 1.95024076, -0.37291038, -0.40424703,  0.30297059],\n",
       "       [-0.81976335, -1.10162466, -0.59823212, -0.10926744],\n",
       "       [-0.5212113 ,  0.29449179,  2.0568032 ,  2.00515735],\n",
       "       [-2.36066876, -0.3294302 , -0.24464646, -0.81432884]])"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[~(names == 'Bob')]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "当想要反转一个条件时，用`~`操作符很方便："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "cond = names == 'Bob'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.40864782,  0.53476799,  1.09620596,  0.4846564 ],\n",
       "       [ 1.95024076, -0.37291038, -0.40424703,  0.30297059],\n",
       "       [-0.81976335, -1.10162466, -0.59823212, -0.10926744],\n",
       "       [-0.5212113 ,  0.29449179,  2.0568032 ,  2.00515735],\n",
       "       [-2.36066876, -0.3294302 , -0.24464646, -0.81432884]])"
      ]
     },
     "execution_count": 101,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[~cond]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "选中2个或3个名字，组合多个布尔条件，用布尔运算符&，|，另外python中的关键词and和or不管用："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], \n",
       "      dtype='<U4')"
      ]
     },
     "execution_count": 104,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ True, False,  True,  True,  True, False, False], dtype=bool)"
      ]
     },
     "execution_count": 105,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mask = (names == 'Bob') | (names == 'Will')\n",
    "mask"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.02584271, -1.53529621,  0.73143988, -0.34086189],\n",
       "       [ 1.95024076, -0.37291038, -0.40424703,  0.30297059],\n",
       "       [-0.48632936,  0.63817756, -0.40792716, -1.48037389],\n",
       "       [-0.81976335, -1.10162466, -0.59823212, -0.10926744]])"
      ]
     },
     "execution_count": 106,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data[mask]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用布尔索引总是会返回一份新创建的数据，原本的数据不会被改变。\n",
    "\n",
    "更改值的方式也很直觉。比如我们想让所有负数变为0："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "data[data < 0] = 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.02584271,  0.        ,  0.73143988,  0.        ],\n",
       "       [ 0.40864782,  0.53476799,  1.09620596,  0.4846564 ],\n",
       "       [ 1.95024076,  0.        ,  0.        ,  0.30297059],\n",
       "       [ 0.        ,  0.63817756,  0.        ,  0.        ],\n",
       "       [ 0.        ,  0.        ,  0.        ,  0.        ],\n",
       "       [ 0.        ,  0.29449179,  2.0568032 ,  2.00515735],\n",
       "       [ 0.        ,  0.        ,  0.        ,  0.        ]])"
      ]
     },
     "execution_count": 108,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用一维的布尔数组也能更改所有行或列："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 111,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], \n",
       "      dtype='<U4')"
      ]
     },
     "execution_count": 111,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "data[names != 'Joe'] = 7"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 7.        ,  7.        ,  7.        ,  7.        ],\n",
       "       [ 0.40864782,  0.53476799,  1.09620596,  0.4846564 ],\n",
       "       [ 7.        ,  7.        ,  7.        ,  7.        ],\n",
       "       [ 7.        ,  7.        ,  7.        ,  7.        ],\n",
       "       [ 7.        ,  7.        ,  7.        ,  7.        ],\n",
       "       [ 0.        ,  0.29449179,  2.0568032 ,  2.00515735],\n",
       "       [ 0.        ,  0.        ,  0.        ,  0.        ]])"
      ]
     },
     "execution_count": 110,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 6 Fancy Indexing(花式索引)\n",
    "\n",
    "通过整数数组来索引。假设我们有一个8 x 4的数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 112,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "arr = np.empty((8, 4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 113,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "for i in range(8):\n",
    "    arr[i] = i"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0.,  0.,  0.,  0.],\n",
       "       [ 1.,  1.,  1.,  1.],\n",
       "       [ 2.,  2.,  2.,  2.],\n",
       "       [ 3.,  3.,  3.,  3.],\n",
       "       [ 4.,  4.,  4.,  4.],\n",
       "       [ 5.,  5.,  5.,  5.],\n",
       "       [ 6.,  6.,  6.,  6.],\n",
       "       [ 7.,  7.,  7.,  7.]])"
      ]
     },
     "execution_count": 114,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "想要按一定顺序选出几行，可以用一个整数list或整数ndarray来指定顺序："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 115,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 4.,  4.,  4.,  4.],\n",
       "       [ 3.,  3.,  3.,  3.],\n",
       "       [ 0.,  0.,  0.,  0.],\n",
       "       [ 6.,  6.,  6.,  6.]])"
      ]
     },
     "execution_count": 115,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr[[4, 3, 0, 6]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用符号来从后选择row："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 5.,  5.,  5.,  5.],\n",
       "       [ 3.,  3.,  3.,  3.],\n",
       "       [ 1.,  1.,  1.,  1.]])"
      ]
     },
     "execution_count": 116,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr[[-3, -5, -7]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "用多维索引数组，能选出由一维数组中的元素，通过在每个tuple中指定索引："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3],\n",
       "       [ 4,  5,  6,  7],\n",
       "       [ 8,  9, 10, 11],\n",
       "       [12, 13, 14, 15],\n",
       "       [16, 17, 18, 19],\n",
       "       [20, 21, 22, 23],\n",
       "       [24, 25, 26, 27],\n",
       "       [28, 29, 30, 31]])"
      ]
     },
     "execution_count": 118,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr = np.arange(32).reshape((8, 4))\n",
    "arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 119,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 4, 23, 29, 10])"
      ]
     },
     "execution_count": 119,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr[[1, 5, 7, 2], [0, 3, 1, 2]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以看到`[ 4, 23, 29, 10]`分别对应`(1, 0), (5, 3), (7, 1), (2, 2)`。不论数组有多少维，fancy indexing的结果总是一维。\n",
    "\n",
    "对于长方形区域，有下面的方法来截取："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 120,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 4,  7,  5,  6],\n",
       "       [20, 23, 21, 22],\n",
       "       [28, 31, 29, 30],\n",
       "       [ 8, 11,  9, 10]])"
      ]
     },
     "execution_count": 120,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面的意思是，先从arr中选出[1, 5, 7, 2]这四行：\n",
    "\n",
    "    array([[ 4,  5,  6,  7],\n",
    "           [20, 21, 22, 23],\n",
    "           [28, 29, 30, 31],\n",
    "           [ 8,  9, 10, 11]])\n",
    "           \n",
    "然后[:, [0, 3, 1, 2]]表示选中所有行，但是列的顺序要按0,3,1,2来排。于是得到：\n",
    "\n",
    "    array([[ 4,  7,  5,  6],\n",
    "           [20, 23, 21, 22],\n",
    "           [28, 31, 29, 30],\n",
    "           [ 8, 11,  9, 10]])\n",
    "           \n",
    "要记住，fancy indexing和切片不同，得到的是一个新的array。\n",
    "\n",
    "# 7 Transposing Arrays and Swapping Axes（数组转置和轴交换）\n",
    "\n",
    "转置也是返回一个view，而不是新建一个数组。有两种方式，一个是transpose方法，一个是T属性："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 122,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  1,  2,  3,  4],\n",
       "       [ 5,  6,  7,  8,  9],\n",
       "       [10, 11, 12, 13, 14]])"
      ]
     },
     "execution_count": 122,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr = np.arange(15).reshape((3, 5))\n",
    "arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 123,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 0,  5, 10],\n",
       "       [ 1,  6, 11],\n",
       "       [ 2,  7, 12],\n",
       "       [ 3,  8, 13],\n",
       "       [ 4,  9, 14]])"
      ]
     },
     "execution_count": 123,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr.T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "做矩阵计算的时候，这个功能很常用，计算矩阵乘法的时候，用np.dot:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 130,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[0 2 4 6]\n",
      " [1 3 5 7]]\n",
      "[[0 1]\n",
      " [2 3]\n",
      " [4 5]\n",
      " [6 7]]\n"
     ]
    }
   ],
   "source": [
    "arr = np.arange(8).reshape((4, 2))\n",
    "print(arr.T)\n",
    "print(arr)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 129,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[56, 68],\n",
       "       [68, 84]])"
      ]
     },
     "execution_count": 129,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.dot(arr.T, arr)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面的例子是 (2x4) x (4x2) = (2x2)。得到的结果是2x2维，就是普通的矩阵乘法。\n",
    "\n",
    "对于多维数组，transpose会接受由轴数字组成的tuple，来交换轴："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 131,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 0,  1,  2,  3],\n",
       "        [ 4,  5,  6,  7]],\n",
       "\n",
       "       [[ 8,  9, 10, 11],\n",
       "        [12, 13, 14, 15]]])"
      ]
     },
     "execution_count": 131,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr = np.arange(16).reshape((2, 2, 4))\n",
    "arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 132,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 0,  1,  2,  3],\n",
       "        [ 8,  9, 10, 11]],\n",
       "\n",
       "       [[ 4,  5,  6,  7],\n",
       "        [12, 13, 14, 15]]])"
      ]
     },
     "execution_count": 132,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr.transpose((1, 0, 2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里，secode axis(1)被设为第一个，first axis(0)第二个，最后的axis没边。\n",
    "\n",
    "使用`.T`来转置swapping axes(交换轴)的一个特殊情况。ndarray有方法叫做swapaxes, 这个方法取两个axis值，并交换这两个轴："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 133,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 0,  1,  2,  3],\n",
       "        [ 4,  5,  6,  7]],\n",
       "\n",
       "       [[ 8,  9, 10, 11],\n",
       "        [12, 13, 14, 15]]])"
      ]
     },
     "execution_count": 133,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 135,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[[ 0,  4],\n",
       "        [ 1,  5],\n",
       "        [ 2,  6],\n",
       "        [ 3,  7]],\n",
       "\n",
       "       [[ 8, 12],\n",
       "        [ 9, 13],\n",
       "        [10, 14],\n",
       "        [11, 15]]])"
      ]
     },
     "execution_count": 135,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "arr.swapaxes(1, 2) # 直交换second axis和last axis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "swapaxes也是返回view，不生成新的data。"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
